ABSTRACT

Current efforts to take advantage of the special 
virtues of the computer as an aid in text analysis are described. 
Verbal constructs, category construction, and contingency analysis 
are discussed and illustrated. Mechanical techniques for reducing 
human labor when studying large quantities of verbal data have been 
sought at an increasing rate by researchers in the behavioral 
sciences. Whatever the purpose of research, if it is to have a 
scientific character, it must involve an attempt to reduce natural
language data, by formal rules, to measures reflecting theoretically 
relevant properties of the text, its source, or its audience effects. 
At the present time, there is no one theory or method dominating the 
field of natural language analysis. Although much work is currently 
being expended to implement a finite set of rules on the computer,
little has been accomplished that is directly useful to researchers 
in the social sciences. (Author/CK) 
On the Uses of the Computer
For Content Analysis in Educational Research*

Jack H. Hiller, Southern Illinois University 

Gerald A. Fisher, Illinois Institute of Technology, Chicago

Donald R. Marcotte, Wayne State University, Detroit

Abstract 

Mechanical techniques for reducing human labor when studying large
quantities of verbal data have been sought at an increasing rate by
researchers in the behavioral sciences. Research interests may range from
attempts to simulate human judgmental performance, as when the computer is
programmed to grade student compositions, to efforts to construct and test
theories of verbal behavior, to exhaustive searches after empirical
relationships in textual data. Whatever the purpose of the research, if it
is to have a scientific character, then it must involve an attempt to reduce
natural language data, by formal rules, to measures reflecting theoretically
relevant properties of the text, its source, or its audience effects.

At the present time, there is no one theory or method dominating the field
of natural language analysis. Different questions seem to lend themselves to
different analytic techniques. The elegant mathematical models of Chomsky and
others carry with them the promise that a natural language is built and
employed in accordance with a finite set of rules. Although much work is
currently being expended to implement such rules on the computer, little has
been accomplished that is directly useful to researchers in the social
sciences.

In this paper, the authors describe current efforts to take advantage of
the special virtues of the computer as an aid in text analysis. In particular,
verbal constructs, category construction, and contingency analysis are
discussed and illustrated with recent investigations.



The underlying basis of all current methods of computer-assisted content
analysis is the process of direct comparison. Words, letters, sentences, or
punctuation marks preselected by the researcher are simply matched against
text data to determine occurrence rates for those units (Stone, Dunphy, Smith,
and Ogilvie, 1966), or to discover the contextual associates of the
preselected units (Laffal, 1965). Alternatively, an internal comparison of
textual units (e.g., words or idioms) not previously specified by the
researcher may be performed to compile word or concept frequency tables
(Sedelow and Sedelow, 1958; Carlson, 1966), or to compile word cluster lists
by the use of factor analysis (Iker and Harway, 1969).
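
The two forms of direct comparison described above can be sketched in a few
lines of present-day Python. This is an illustration only, not a
reconstruction of any program cited in this paper; the function names, the
tokenizing rule, and the sample sentence are assumptions made for the
example.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and return its words (letters and apostrophes only)."""
    return re.findall(r"[a-z']+", text.lower())

def occurrence_rates(text, preselected):
    """Rate of each researcher-selected word per 100 words of text."""
    words = tokenize(text)
    counts = Counter(words)
    total = len(words) or 1  # avoid dividing by zero on empty text
    return {w: 100.0 * counts[w] / total for w in preselected}

def frequency_table(text, top=5):
    """Frequency table compiled from the text itself, with no preselected list."""
    return Counter(tokenize(text)).most_common(top)

# Hypothetical sample text.
sample = "The cat saw the dog. The dog saw nothing."
print(occurrence_rates(sample, ["the", "dog", "bird"]))
print(frequency_table(sample, top=1))
```

The first function corresponds to matching preselected units against text;
the second to the internal comparison in which no units are specified in
advance.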

Two requirements are set for the units of analysis: they must be capable 
of being recognized by a suitably programmed computer, and they must also have 
relevance to theory or the goal of research. Let us stress that theory is a 
necessary guide to the selection of the units of analysis, and that theory, 
whether profound or trivial, is yet the work of the human. The computer in all
such analyses acts merely as a clerk or calculator, but never as theorist. 
However, this description of the computer's role is not universally agreed to. 

Page and Paulus (1968), for example, had a computer programmed to count 
commas, periods, and classes of words, such as common words, to enable computer 
grading of essays. Based on the frequency statistics of such units in a stu- 
dent's essay, the computer printed out one of several previously stored phrases. 
In describing this computer activity, Page and Paulus (1968, page 162) asserted,
"the computer begins to understand what it is told by the student and is able to
intelligently respond to him." To the contrary, the intelligence manifested by
such non-adaptive computer programs reflects the understanding not of the
computer, but that of the theory reflected in the computer program.



In an explicitly stated theoretical approach to essay grading research,
qualities of writing described by stylists were translated into exemplifying
words and phrases, which the computer was programmed to identify (Hiller,
Marcotte, and Martin, 1969). The frequency of these units in a set of essays
was then measured by the computer and found to bear statistically significant
correlations with the writing quality grades independently assigned the essays
by expert human graders. The research philosophy found here is, we believe,
common to most content analyses performed by computer. We, therefore, will
discuss this research and its rationale in detail.

NATURE OF TEXT ANALYSIS 

Any text forming the natural language data to be studied is exactly what it
is. To rephrase its meaning, to reduce it to kernel sentences, or to measure it
in some way or ways cannot in principle provide the researcher with a truly
equivalent representation of the original text. The very nature of analysis is
the process of abstracting from the corporate whole characteristics of the text
which are defined by theory. Again, the nature of such characteristics and the
techniques employed to measure their appearance in text derive from theory. In
attending only to certain characteristics, others must necessarily lie
unnoticed. Thus, analysis serves to reduce a text from its unity, on the one
hand, and from its potentially infinite reserve of characteristics, on the
other, to a restricted set of measures. We may say the text is reduced to its
theoretically important essentials by analysis and measurement.

The text characteristics measured have two uses: description and inference.
Description is achieved by having the computer count units of direct interest;
for example, the number of sentences written, their average length, and the
number of words in the text, or the frequency of specific words themselves of
direct interest, such as the word "the," or the punctuation comma. The other
use is directed to measurement of properties of text capable of leading to an
inference concerning the source of the text, or its potential effects on some
audience, or, in fact, its informational content. Operationally, descriptive
and inferential measures may be identical. For example, suppose we are testing
the notion that long sentences (those over 20 words) typically produce
comprehension difficulties. Until this hypothesis is substantiated, long
sentences would tentatively represent the inferred property, difficult style.
Now assume that we accept the hypothesis and are training students to write
with an emphasis on reading ease; feedback to the student would then be a
description of his sentence lengths.
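
A descriptive report of this kind might be produced as in the following
sketch. The 20-word threshold comes from the example above; the function
names and the crude rule that a sentence ends at a period, question mark, or
exclamation point are assumptions of the illustration (abbreviations, for
instance, would be split wrongly).

```python
import re

LONG_SENTENCE_WORDS = 20  # "long" means over 20 words, as in the example above

def sentence_lengths(text):
    """Word count of each sentence, splitting naively on . ! and ?"""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(s.split()) for s in sentences]

def describe(text):
    """Descriptive measures: sentence count, word count, average and long sentences."""
    lengths = sentence_lengths(text)
    return {
        "sentences": len(lengths),
        "words": sum(lengths),
        "average_length": sum(lengths) / len(lengths) if lengths else 0.0,
        "long_sentences": sum(1 for n in lengths if n > LONG_SENTENCE_WORDS),
    }

print(describe("One two three. Four five."))
```

The same counts serve as description when reported to the student and as an
inferential measure when long sentences are taken as an index of difficult
style.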

STRATEGIES OF ANALYSIS

Determination of a text's meaning by computer is currently impossible and
will continue to be unmanageable even if parsers capable of automatically
reducing sentences to kernels are perfected. Assume, for example, that we wish
to determine if a given text contains a certain message. An obvious fact of
language is that utterances which bear the same sense with respect to some
point of view may yet have different graphemic expression. "These are certainly
the facts of life," and "These certainly are the facts of life," will for most
purposes be equivalent. Yet, if one sentence were the message to be sought in
the text, and the other were in the text, a computer programmed to perform a
simple direct word-by-word match of message to text would fail to discover the
message. A suitable parser would convert both message and text to kernel
sentences which then might be effectively compared. However, it is apparent
that messages comprised of more than a single sentence might not be effectively
analyzed, even with the perfected sentence parser. Let us illustrate this point
with two passages, each formed from identically worded sentences.



C'EST LA VIE

One day by chance, Mary happened to encounter John. Mary and John had a
love affair. Lovely were the buds of Spring, and lovelier yet Summer's blooms.
But then, as in all life's adventures, Summer turned to Fall, and Fall decayed
into Winter's death. Mary endured, despite the finality. On one light Spring
day, fickle Mary met Peter, and promptly married him. Despite her initial
anxieties, she lived happily forever after.

THE ADULTERESS

One light Spring day, fickle Mary met Peter, and promptly married him. But
then, as in all life's adventures, Summer turned to Fall, and Fall decayed into
Winter's death. Mary endured despite the finality. One day by chance, Mary
happened to encounter John. Lovely were the buds of Spring, and lovelier yet
the blooms of Summer. Mary and John had a love affair. Despite her initial
anxieties, she lived happily forever after.

It may be seen here that sentence order provides a contribution so crucial
to the meaning of these passages that a simple comparison of their sentences or
kernels would falsely determine equivalence. This conclusion of false
equivalence might be refuted by requiring that matched sentences occur in
identical order. However, any such restriction could also lead to an invalid
result. For example, the meaning of the second passage would not importantly be
affected if the order of the sentences, "Mary and John had a love affair," and
"Lovely were the buds of Spring...," were to be interchanged. It is certain
that analyses performed only at the level of the sentence, with all sentences
treated in isolation, will often lead to inaccurate text analysis.
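
Both failures can be shown with a small sketch. The three-sentence stories
below are abbreviated, hypothetical stand-ins for the two passages, and the
function names are likewise assumptions of the illustration.

```python
def same_sentences(a, b):
    """True if both passages contain the same sentences, ignoring their order."""
    return sorted(a) == sorted(b)

def same_sentences_in_order(a, b):
    """True only if the same sentences occur in the same order."""
    return a == b

# Hypothetical abbreviations of the two passages.
affair_then_marriage = ["Mary met John.", "Winter came.", "Mary married Peter."]
marriage_then_affair = ["Mary married Peter.", "Winter came.", "Mary met John."]

# Order-free matching calls two different stories equivalent.
print(same_sentences(affair_then_marriage, marriage_then_affair))   # True

# Strict ordering rejects even a reordering that leaves the story intact.
harmless = ["Winter came.", "Mary met John.", "Mary married Peter."]
print(same_sentences_in_order(affair_then_marriage, harmless))      # False
```

Whether the second reordering is in fact harmless is a judgment the program
cannot make; that judgment is precisely what sentence-level matching leaves
out.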

Another approach to content analysis divorces itself from syntax. In this
approach, the researcher limits the computer's task to a search for discrete
textual units which have been preselected to enable measurement of specified
text characteristics. As stated earlier, such characteristics have two
distinguishable uses; namely, description and inference.

A descriptive analysis is performed simply by having the computer identify
and count text units. In general, the analyst's work in determining appropriate
units for a given description is simple. A description of the "number of words"
contained in a text clearly requires that words be selected as the unit of
measure. (A minor problem arises, however, concerning treatment of hyphenated
words and idiomatic expressions or phrases which offer a single sense through
use of two or more written words. For example, in "John is a used car
salesman," we may wish to treat "used car" as a single idea and hence as a
single word.) In the analysis, all written words are treated as an equivalence
class, All Words, and all such equivalence classes are termed "categories."
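
One way to handle the minor problem just mentioned is to join each listed
phrase into a single token before counting. The sketch below assumes that
the researcher supplies the phrase list; the function name and the decision
to count a hyphenated word as one word are choices made for the illustration.

```python
import re

def count_words(text, phrases=()):
    """Count words, treating each listed multi-word phrase as a single word."""
    lowered = text.lower()
    # Join longer phrases first so they are not broken up by shorter ones.
    for phrase in sorted(phrases, key=len, reverse=True):
        phrase = phrase.lower()
        joined = phrase.replace(" ", "_")
        lowered = re.sub(r"\b" + re.escape(phrase) + r"\b", joined, lowered)
    # A word is a run of letters, optionally linked by _ ' or - to further runs.
    return len(re.findall(r"[a-z]+(?:[_'-][a-z]+)*", lowered))

sentence = "John is a used car salesman."
print(count_words(sentence))                        # six written words
print(count_words(sentence, phrases=["used car"]))  # five, with the idiom joined
```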

CATEGORIES 

A category is a list of items (e.g., words, letters, punctuation, etc.)
selected by the researcher to represent a text property. The items selected
for inclusion in a category are chosen on the assumption that all possess a
common feature. Thus, category items may be highly dissimilar along a number of
dimensions, provided all possess the attribute defining the category. In
addition to the category All Words, we may also define Sentence Words and have
the computer count the number of words in each text sentence, and in addition
provide statistical descriptions, such as average sentence length or standard
deviation of sentence length. Another category could be four letter words, five
letter words, and so on. These particular examples of word categories do not
reflect any notion of sense or meaning.

Perhaps the most common type of meaningful category is the synonym list.
Another category type, which relates to meaning but does not rely on synonymy,
is the special purpose category, of which there are infinitely many possible.
For example, we may define the special purpose category, color words, which
would contain the items aqua, blue, green, etc., or the category, cars
introduced in America during 1945, and so on. Such categories as provided in
the above examples offer the researcher no particular difficulty regarding the
selection of items for category inclusion. The concepts underlying these
categories rather directly imply their items.
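
In program form a category is nothing more than a named list of items, and
scoring is a count of the text words found on each list. In the sketch
below, the color words come from the example above, while the second
category, the sample sentence, and the function name are hypothetical.

```python
import re

# Each category is a named set of items chosen by the researcher.
CATEGORIES = {
    "color_words": {"aqua", "blue", "green"},
    "vehicle_words": {"car", "truck"},  # hypothetical second category
}

def category_counts(text, categories=CATEGORIES):
    """Number of text words falling in each category (zero if none are found)."""
    counts = {name: 0 for name in categories}
    for word in re.findall(r"[a-z']+", text.lower()):
        for name, items in categories.items():
            if word in items:
                counts[name] += 1
    return counts

print(category_counts("The blue car passed a green and blue truck."))
```

Nothing in the procedure depends on what the items have in common; that
judgment rests entirely with the researcher who drew up the list.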

CONSTRUCTS 

However, certain concepts of potential interest do not immediately suggest
the category items. A concept or construct is an abstraction which attains
meaning within a theory. As an abstraction, a construct cannot itself be
directly observed. But any interesting construct must permit prediction of
observable behaviors related to it. Anxiety, for example, is a construct which
psychologists have employed to explain aspects of speech. Mahl (1959),
hypothesizing that speech disturbances could reflect anxiety, correlated counts
of speech disturbances of patients with subjective listener ratings of their
anxiety, and found the predicted relationship. Mahl, in his study of speech
disturbances, for example, encountered a validation problem. He noted that his
listener ratings may have been contaminated by the raters' knowledge of the
speech disturbance hypothesis. Because of this, the test of the hypothesis
lacked validity. In general, validation of the index designed to represent a
construct constitutes the most important and difficult phase of research.

Techniques for validating the instrument of measurement are illustrated in
the following section by describing research in which validation of a
particular construct, vagueness, and its hypothesized category representation
(operational definition) were pursued.



RESEARCH ON VAGUENESS: AN EXAMPLE OF CONSTRUCT VALIDATION 

Vagueness has been defined as a "psychological construct which refers to
the state of mind of a communicator who does not sufficiently command the
facts, knowledge or understanding required for maximally effective
communication" (Hiller, 1969). Vagueness is an internal stimulus condition
which develops in a speaker or writer as he commits himself to deliver
information he can't remember or simply doesn't know. It is assumed that the
experienced communicator has learned a set of verbal responses which enable him
to move on from his point of difficulty. Based on observations of verbal
behavior, vagueness response categories were formulated and their items
selected (Hiller, 1969). For example, one of the categories was termed
"approximation" and among the items chosen as clues were: almost, about as,
kinda, sorta, pretty much, etc. The complete set of categories is presented in
Figure I.

It was hypothesized that writers who are vague in thinking would, as a
consequence, write less effectively. A set of 250 student essays was processed
by computer to measure the vagueness represented by each essay, and these
measures were then correlated with essay grades provided by human judges who
knew nothing of the vagueness construct. Vagueness was found to correlate -.26
with scores for essay Content, and -.32 with scores for Creativity (both
correlations are significant at p < .0005; Hiller, Marcotte, and Martin, 1969).
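
The analysis just described has two steps: score each essay for vagueness,
then correlate the scores with the judges' grades. The following sketch
shows both. The item list holds only the five "approximation" items quoted
above, not the full dictionary; scoring vagueness as hits per word is an
assumption of the illustration, since the scoring formula is not given here;
and all names are hypothetical.

```python
import math
import re

# The five "approximation" items quoted in the text (not the full dictionary).
VAGUENESS_ITEMS = ["almost", "about as", "kinda", "sorta", "pretty much"]

def vagueness_rate(text, items=VAGUENESS_ITEMS):
    """Vagueness items found per word of text (an assumed scoring rule)."""
    lowered = text.lower()
    n_words = len(re.findall(r"[a-z']+", lowered))
    hits = sum(
        len(re.findall(r"\b" + re.escape(item) + r"\b", lowered))
        for item in items
    )
    return hits / n_words if n_words else 0.0

def pearson_r(xs, ys):
    """Product-moment correlation of two equal-length lists of numbers."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

print(vagueness_rate("It was pretty much almost done"))
```

Given one vagueness rate and one grade per essay, `pearson_r(rates, grades)`
yields a coefficient of the kind reported above. The function fails if
either list has no variance.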

Insert Figure I about here

In a second test of the construct, teachers' lectures were studied to
investigate the relationship between lecturing effectiveness and vagueness. A
measure of the teacher's lecturing effectiveness was obtained by administering
a multiple choice test of lesson comprehension to the teacher's class
immediately after he had presented a 15 minute lecture. The complete set of
teacher lectures, which contained over 100,000 words, was transcribed from
video tape recordings and then key-punched (Gage, Belgard, Dell, Hiller,
Rosenshine, and Unruh, 1971). In one set of 32 lectures, Vagueness correlated
-.59 (p < .005) with class test scores, and in a second set of the lectures
-.48 (p < .05; Hiller, Fisher, and Kaess, 1969).

In a subsequent experiment, knowledge of speakers was manipulated to
determine if vagueness, as measured through use of the vagueness response
dictionary (a dictionary is a set of categories), is indeed related to the
speaker's command of his subject matter. One group of speakers listened to a
tape recorded lesson for 15 minutes before presenting their own lectures based
on the lesson just heard. A second group listened to the lesson after 50
percent of the tape was randomly replaced with fragments of a second lesson.
The group with the mutilated tape displayed a greater use of the vagueness
items, F (1,20) = 15, p < .001, as had been predicted (Hiller, 1969). A sample
of a vague lecture is shown in Exhibit

The experiment specifically inquired if items hypothesized to be clues to
vagueness could be demonstrated to occur when a speaker is working from
inadequate knowledge. However, the experiment, as such, does not and could not
prove the theory true. Consider the chain of inference between experiment and
theory: a speaker committed to address an audience on an informative topic
discovers he does not know or understand the material which has to be
communicated; this inadequacy arouses an internal stimulus condition; finally,
responses previously performed and reinforced to this stimulus are evoked in
the speaker. The internal stimulus condition, vagueness, is a hypothetical
construct whose presence must be inferred; other explanations for the
experimental results may always be found.

IMPROVING THE INSTRUMENT OF MEASUREMENT 

Although the measures of vagueness based on the items included in the
vagueness dictionary have related to validation criteria as had been predicted,
the validity and reliability of each of the individual items in the dictionary
or categories must yet be dealt with. Any given item may on certain occasions,
in certain language contexts, provide an erroneous indication. In addition,
certain items may be associated with intense vagueness, that is, a
communicator's being completely befuddled, or with only slight uncertainty.
Other of the items may quite simply have no validity as clues to vagueness.
Since the vagueness dictionary contains 353 words and phrases, research to
investigate all of the individual items could be overwhelming. However, given a
suitable validation criterion, and given a text in which each item to be tested
occurs with sufficient frequency, one manageable strategy is to split the text
into two sections, one to be used to conduct a pilot study, and the other to be
used to cross validate the results of the pilot study.

Selection of the items to be tested is a task at least as important as
eventual validation tests. The theorist may himself select the items, or he may
instruct a panel of judges to provide them. Different forms of instruction are
possible. For example, the judges may be told of the construct; that is, the
construct definition may be presented and explained as the basis for having
each judge volunteer category items. Judges may also be presented samples of
verbal behavior to have them search for response items. Or the judges, after
having had the construct explained to them, may be presented a list of items
tentatively selected by the researcher. The judges may accept or reject items,
or they may rate items for relevance or degree of construct representation in a
manner similar to the semantic differential. The researcher may employ such
ratings as item weights when calculating construct representation in his data
(see Holsti, 1969). Items with low ratings might also be discarded from the
computer search list to save computer time. However, any such exclusion of weak
items may jeopardize the validity of a study. If a given writer under study
happens to prefer use of an item not included in the category, but the item
happens to signify an important property of the text, then analysis without it
may yield invalid results.

We must also urge that computer counts not be accepted at face value, but
that any result be checked by human inspection of the text under analysis. Most
of the common words of our language cloak different meanings in similar
spellings (the homograph problem); thus a simple match of category words
against text may lead to invalid scoring. Stone (1969) reports that a set of
procedures for resolving this ambiguity through use of context clues is in
preparation. But until such disambiguation procedures are available, a simple
expedient is to have the computer print out each item found in text along with
its context, to enable human checking. It should also be noted that use of
phrases rather than single words greatly improves scoring accuracy. For
example, the word "kind" may refer to "classification" or to "thoughtfulness,"
but it also forms part of the phrase "kind of" or "kinda". The word "kind" in
the sense of classification happens to be a vagueness item. While use of "kind"
leads to many scoring errors, "kind of" is error free. Word combinations fix
meaning perhaps surprisingly well.
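
The expedient of printing each item with its context can be sketched as
follows. The function name, the bracketed output format, and the sample text
are assumptions of the illustration; `width` is the number of words of
context shown on each side.

```python
import re

def keyword_in_context(text, item, width=3):
    """Each occurrence of a word or phrase, with `width` words on either side."""
    words = text.split()
    target = item.lower().split()
    n = len(target)
    lines = []
    for i in range(len(words) - n + 1):
        # Compare with punctuation stripped and case ignored.
        window = [re.sub(r"[^\w']", "", w).lower() for w in words[i:i + n]]
        if window == target:
            left = " ".join(words[max(0, i - width):i])
            right = " ".join(words[i + n:i + n + width])
            lines.append(f"{left} [{' '.join(words[i:i + n])}] {right}".strip())
    return lines

# Hypothetical sample text.
sample = "She was kind to him. It was kind of a mess."
for line in keyword_in_context(sample, "kind", width=2):
    print(line)
```

Searching for "kind" alone returns both sentences, one of which concerns
thoughtfulness; searching for "kind of" returns only the second. A human
reader can see the difference at once from the printed context.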

WORD CO-OCCURRENCES AS CLUES TO MEANING

In the preceding section, it was suggested that phrases or co-occurring
words may greatly improve the machine's accuracy over single words in scoring
content. An interesting test of this proposal was conducted by comparing the
two methods in a product simulation of the human content scoring of
short-answer identification essays (Marcotte, 1959). As part of a final
examination in history, students were required to respond to 12 terms, such as
Cluniac Movement, craft guild, Cicero, etc., in a few sentences that were to
demonstrate their familiarity with the terms. The course professor provided his
graduate assistant an answer key for each of the 12 terms. For purposes of
research, these keys were then used by judges to grade the students' responses.
The keys were also used as the basis for computer grading, as explained below.

Each key was first inspected to insure that it had simple sentences or
phrases; compound and complex sentences, whenever found, were transformed into
simple sentences. Such sentences provided the primary computer scoring units.
To economize computer time, function words (the, a, of, etc.) were eliminated
from the keys. The keys were then inspected word by word to determine where
synonyms were needed, and with the aid of several synonym dictionaries, each
word was replaced by a category of synonyms with one or more words and phrases.
Thus, each of the simple sentences of a key was recast as a set of one or more
categories. Similarly, the student essays were processed by computer so that
all words were either replaced by category markers or eliminated; the computer
processing was also set to retain sentence units during the transformation of
text to categories. Basically, the computer next matched the categories of each
key with the categories of the student's response.

To be more specific, several scoring procedures were applied. In the
simplest procedure, sentence organization in both the key and response was
ignored and the machine computed the percentage of key categories found in the
student's response (the single word frequency method). In contrast to this
procedure, each sentence unit of the key was matched against each unit of the
response; credit was given the response only if all categories in the key
sentence were found in any one of the student's sentences (the narrow context
co-occurrence method). As an extension of this method, the key sentence was
also matched against the entire set of categories found in any three contiguous
sentences of the student's answer (wide context co-occurrence method). Since it
could be anticipated that the list of synonyms provided for each word of the
key might be incomplete, a technique compromising between the frequency and
co-occurrence methods was also programmed. In this method, it was required that
at least half the categories of a key sentence be found in the student's
response for scoring to proceed; here the student earned a percentage score for
the number of categories common to his response and the key divided by the
total number of categories in the key sentence, with, of course, a minimum
non-zero score of 50% (the threshold co-occurrence method). In practice, the
threshold method was applied to single sentences (narrow context threshold) and
to three contiguous sentences (wide context threshold). Thus five scoring
techniques were programmed. The scores for the essays produced by the computer
were then correlated with the criterial human grades.
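
The five techniques can be expressed compactly once each key sentence and
each response sentence has been reduced to a set of category markers. The
sketch below is an illustration, not the program used in the study. In
particular, averaging the credit over key sentences is an assumption, since
the text does not say how sentence scores were combined, and the category
names in the example are hypothetical.

```python
def frequency_score(key_sentences, response_sentences):
    """Single word frequency method: share of key categories found anywhere."""
    key_cats = set().union(*key_sentences)
    found = set().union(*response_sentences)
    return len(key_cats & found) / len(key_cats)

def windows(response_sentences, size):
    """Pooled category sets for every run of `size` contiguous sentences."""
    n = len(response_sentences)
    if n <= size:
        return [set().union(*response_sentences)]
    return [set().union(*response_sentences[i:i + size])
            for i in range(n - size + 1)]

def cooccurrence_score(key_sentences, response_sentences, size=1, threshold=1.0):
    """Co-occurrence scoring, averaged over key sentences (an assumption).

    size=1 is the narrow context, size=3 the wide context.
    threshold=1.0 demands every category of the key sentence;
    threshold=0.5 is the threshold method (at least half, partial credit).
    """
    total = 0.0
    for key in key_sentences:
        best = max(len(key & w) / len(key)
                   for w in windows(response_sentences, size))
        total += best if best >= threshold else 0.0
    return total / len(key_sentences)

# Hypothetical key and response, already reduced to category markers.
key = [{"MONK", "REFORM"}, {"ABBEY"}]
response = [{"MONK"}, {"REFORM", "CHURCH"}]

print(frequency_score(key, response))
print(cooccurrence_score(key, response, size=1))                 # narrow co-occurrence
print(cooccurrence_score(key, response, size=3))                 # wide co-occurrence
print(cooccurrence_score(key, response, size=1, threshold=0.5))  # narrow threshold
print(cooccurrence_score(key, response, size=3, threshold=0.5))  # wide threshold
```

In this example the narrow co-occurrence method gives no credit at all,
because the two categories of the first key sentence fall in different
sentences of the response.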

The first finding to note is that the requirement of complete co-occurrence
was too stringent; most responses scored this way simply earned zero; in short,
it did not permit discrimination necessary for grading. As regards the
correlations between judge grade and scores generated by the remaining three
methods, the results were quite good (see Table I).

Insert Table I about here

The test of the adequacy of these computer scoring techniques appears to have
been limited by the reliability of the human judges. It may be seen, for
example, that the computer scores agree as well or better with the pooled judge
grades than the judges agree with each other. Most important, the experimental
hypothesis was supported by the finding that the threshold, wide context
technique for scoring co-occurrence correlated better with the criterion than
the word frequency technique. The success of the co-occurrence measures must be
interpreted with respect to the kinds of answers required by the particular
identification terms forming the student test. Logical development of thought
was not required for a correct answer to any of the identifications, and since
the average essay contained only three sentences, there was little opportunity
for categories to occur beyond the three sentence unit which had worked so
well in this research.

FISHING EXPEDITIONS 

In the research projects described above, the flow of activity proceeded
from conceptualization of the problem, to the formulation of theory and
constructs, to the construction of categories, and ultimately to a test of the
empirical relationships deduced from the theory. In this process, the only
virtue displayed by the computer was that of counting machine.

However, not all researchers believe that theory is the most useful guide
to research. Page, for example, defended the research strategy he employed
in the attempt to grade essays by computer by arguing that, "In general,
prediction research would be unnecessarily and artificially restrained if it
were not permitted use of any convenient predictors, regardless of the
vagueness of rationale for their inclusions. There were in this study a fair
number of what might be called, therefore, 'proxes of opportunity'" (Page
and Paulus, 1968, page 25). (Page defines "prox" as any measure which does
not provide direct information on a variable of interest, but does approximate
it; a list of common spelling errors, if used by the computer to measure
errors in text, would be termed a prox.)

A fine illustration of the futility of such atheoretical research is
that of Dell and Hiller (1971), who attempted to discover correlates of
effective lecturing. At the time this study was conducted, dictionaries
constructed specifically for analyses of teacher lectures had not yet been
key-punched. But the General Inquirer's Harvard Psychosociological Dictionary
III was available and on that basis employed to search for characteristics of
teacher effectiveness. Unfortunately, some fish were caught. For example,
"Medical terms" demonstrated a correlation of -.55 (p < .05) with effectiveness
in a lecture about Yugoslavia, and "Sex Themes" a correlation of .75. Having
obtained these correlates, the investigators were quite unable to contrive any
productive explanations. The computer in such explorations merely absorbs
resources better spent on constructive thinking.

Conclusions

The writers conducted a review of the literature pertaining to computer
applications in language analysis which failed to uncover any evidence even
hinting that use of the computer contributed to the theoretical excellence
of research employing it as an instrument of measurement and analysis (with
the single possible exception being the research on stylistic analysis
reported by Sedelow and Sedelow, 1967). Yet one frequently encounters the
argument that use of the computer forces the researcher to make his theory
explicit, since programming the computer for analysis absolutely requires
a formal, exact statement of the analytic measures. Where this view accurately
portrays a body of research, that research may well be described as trivial
(e.g., where the theory is stated to be a collection or list of search words).
But worse, this view misrepresents the relationship of measurement to theory
in permitting the measures (word counts) to go unexplained. In the case of
having the computer grade essays by counting the number of commas, hyphens,
colons, etc., the research has turned from concern with important
characteristics of writing, which the computer cannot measure, to concern for
measurement of superficial statistical characteristics of text which the
computer can manage. Furthermore, since no theory is explicated which relates
counts based on certain of these statistical aspects of text (hyphens, dashes,
slashes, etc.) to the human judgmental process invoked when writing quality
is evaluated, it is clear that use of the computer does not insure that theory
will be developed and clearly formulated. Use of the computer to spew
correlations may thus lead the researcher to overlook significant features
of the text to be studied.

It is our opinion that emphasis on the computer as an agent in automated
content analysis is misplaced. Lest this opinion appear merely academic,
having as its aim a straw-man, we may point to the fact that a new professional
journal has appeared, Computer Studies in the Humanities and Verbal Behavior.
We see three noteworthy dangers in such emphasis. One, the lay public may
be enticed by the scientific appearance of computer research to grant its
authors and results unfounded acceptance. Two, considerable resources may
be wasted in pursuit of trivial correlations. Three, gradual public
realization of some of the severe limitations of computerized content analysis
may also produce an unjustified rejection of this form of research.

FOOTNOTE

*An earlier version of this paper was presented at the Association for
Computing Machinery National Conference, San Francisco, August






Category                                              Number of Category Items

Ambiguous designation (all of this, and things,
    somewhere, other people)                                     61
Negated intensifiers (not all, not many, not very)               57
Approximation (about as, almost, pretty much)                    29
"Bluffing" and recovery (a long story short,
    anyway, as you all know, of course)                          55
Error admission (excuse me, not sure, maybe,
    I made an error)                                             18
Indeterminate quantification (a bunch, a couple,
    few, some)                                                   28
Multiplicity (aspects, factors, sorts, kinds)                    36
Possibility (may, might, chances are, could be)                  17
Probability (probably, sometimes, ordinarily,
    often, frequently)                                           19
Reservations (apparently, somewhat, seems, tends)                33

TOTAL                                                           353

FIG. I. VAGUENESS CATEGORIES. (The latest vagueness dictionary has been
modified to include common pronouns, e.g., this, that, it, etc.)

Table I

Scoring method                              Median        Range

Frequency                                    .495       .30 to .61
Threshold narrow                             .530       .21 to .71
Threshold wide                               .605       .33 to .74

Average correlation of each of 8 judges
with the 7 others*                           .53   .475

* Fisher Z transforms were applied prior to averaging.
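
For readers unfamiliar with the procedure named in the table note, averaging
correlations by way of Fisher's Z means transforming each r to
z = arctanh(r), averaging the z values, and transforming the mean back with
tanh. The sketch below illustrates this with made-up coefficients; the
function name is hypothetical and no data from the study are reproduced.

```python
import math

def average_correlation(rs):
    """Average correlation coefficients through Fisher's Z transform."""
    zs = [math.atanh(r) for r in rs]      # r -> z
    return math.tanh(sum(zs) / len(zs))   # mean z -> r

# Made-up coefficients, for illustration only.
print(average_correlation([0.30, 0.70]))
```

Because the transform stretches values near 1, the result for .30 and .70
lies slightly above their arithmetic mean of .50.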



